Plant Phenomics — Latest Matching Preprints

1

GuavaVision AI: An Explainable Deep Learning Framework for Automated Classification, Lesion Localization, and Segmentation of Guava Diseases

Biswas, J.; Islam, M.; Bangabashi, M. M.; Akter, M.; Nishi, T. S.; Sheikh, M. K.; Mia, M. R.; Anwar, M. M.

2026-06-23 bioengineering 10.64898/2026.06.18.733093 medRxiv

Top 0.1%

53.5%

Guava cultivation is considerably affected by foliar and fruit diseases whose overlapping symptoms and environmental variability make accurate field-level diagnosis challenging. Numerous studies have sought efficient methods of diagnosing plant diseases, but most focus on image-level classification and do not include lesion localization or pixel-level segmentation within a single analysis framework. This study proposes a comprehensive automated image-analysis framework that classifies guava leaf and fruit diseases at the image level, localizes lesions, and segments them at the pixel level, using multiple images of each disease collected under various growing conditions. The dataset was enriched through three augmentation strategies (standard preprocessing, structured augmentation, and GAN-based synthetic image generation), expanding the effective training data to approximately 7,000 images, while a 5-fold cross-validation strategy guided model selection and final performance was assessed on a held-out test set. Among multiple state-of-the-art convolutional neural networks (CNNs) evaluated for classifying guava leaf and fruit diseases, the ResNet50+DenseNet121 fusion model achieved the highest classification accuracy, 98.20%. For lesion detection and segmentation, YOLOv8-seg outperformed Mask R-CNN, achieving mAP@0.5 of 0.907 and 0.889, and mAP@0.5:0.95 of 0.783 and 0.769 for detection and segmentation, respectively, with a balanced precision-recall profile. Explainable AI (XAI) techniques were used to increase the transparency of the model by identifying image regions relevant to the actual lesions. The framework was further designed with practical web-based deployment in mind, evaluating both lightweight and high-capacity models to balance computational efficiency against predictive accuracy. The study concludes that model fusion, data augmentation, and segmentation-aware lesion detection together can support effective management of guava diseases.

2

Leaf movements as a quantitative metric for early stress detection

Herrero, E.; Wijeweera, S.; Gill, A. R.; Bampton, C.; Sullivan, W.; Stamford, J. D.; Bromley, J.; Antoniades, A. Z.; Mortimer, J. C.; Webb, A. A. R.; Gilliham, M.; Millar, A. H.

2026-07-08 plant biology 10.64898/2026.06.16.732190 medRxiv

Top 0.1%

17.8%

Early, precise, and non-destructive stress detection is essential for maintaining crop productivity, particularly in high-density plant growth systems like controlled environment agriculture (CEA), where manual monitoring is often impractical. Using plant motion as a proxy for growth and plant health, we demonstrate a method for early, non-invasive stress detection through quantitative leaf-movement analysis in lettuce and five other CEA-relevant crops. Leaf-movement dynamics under stress were imaged with a low-cost, scalable Raspberry Pi imaging setup and quantified using a repurposed open-source motion estimation algorithm, Tracking Rhythms in Plants (TRiP). Our system detected stress-induced changes in leaf-movement within 1 hour of stress, with the timing dependent on the nature of the stress. Sustained reductions in leaf-movement coincide with decreased biomass accumulation. This approach offers a non-invasive, rapid, scalable, and cost-effective solution for continuous crop monitoring, with potential for application in both terrestrial and space farming CEA systems. Graphical abstract: Quantification of leaf-movement dynamics as a high-throughput proxy for plant physiological status, enabling early stress detection and timely intervention to mitigate yield penalties in CEA settings (image made with biorender.org).

3

DeepPheno: A Deep Learning Framework for Linking Hyperspectral Imaging and SNP Genotypes in Lettuce

Okyere, F. G. G.; Mehrem, S. L.; Snoek, B. L.; Van den Ackerveken, G.; Abeln, S.

2026-07-10 plant biology 10.64898/2026.07.09.737449 medRxiv

Top 0.1%

15.1%

While whole genome sequencing captures millions of single nucleotide polymorphisms (SNPs) and hyperspectral imaging (HSI) enables non-destructive plant phenotyping, integrating these modalities to link genotype to phenotype remains challenging due to their high dimensionality and non-linearity. This study presents DeepPheno, a deep learning framework that predicts SNP genotypes from HSI data, using model predictability as a proxy for genotype-phenotype association. HSI data were acquired from 194 lettuce genotypes under field conditions. HSI data patches (20 × 20 pixels × 224 spectral bands) were used to train a hybrid CNN to predict the variant of a specific SNP. The framework was validated on SNPs with known phenotypic effects (anthocyanin, leaf serration, pale pigmentation), achieving high predictive performance (AUC ranging from 0.806 to 0.935), whereas models trained on randomly shuffled labels performed at chance (mean AUC ≈ 0.51). When the workflow was extended to 50 randomly selected, putatively neutral SNPs, most yielded low predictability, but two showed high performance (AUC > 0.76), suggesting uncharacterized genotype-phenotype links. Explainable AI, including SHAP and Grad-CAM, identified relevant spectral and spatial features driving these predictions, particularly the green and red-edge wavelengths associated with pigment dynamics and leaf structure. These results establish a framework for understanding complex genotype-phenotype interactions in plants and extracting these links from HSI data without predefining the exact trait values. It provides an avenue for high-throughput trait discovery and description and extends the integration of image-based phenomics with plant genetics.
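
The comparison between real-label AUC and a shuffled-label baseline can be illustrated with a small, self-contained sketch. It is not the DeepPheno code: the function names `roc_auc` and `shuffled_label_auc` and the toy data are hypothetical, and the control is simplified. The abstract describes retraining models on shuffled labels, whereas this sketch only permutes labels at evaluation time to show what a chance-level AUC looks like.

```python
import random


def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def shuffled_label_auc(labels, scores, n_permutations=200, seed=0):
    """Mean AUC over random label permutations (simplified chance-level control)."""
    rng = random.Random(seed)
    shuffled = list(labels)
    total = 0.0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        total += roc_auc(shuffled, scores)
    return total / n_permutations


# Toy data (illustrative only): binary SNP variant labels and model scores.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.2, 0.6, 0.5, 0.8, 0.7, 0.9]

print("AUC with true labels:    ", roc_auc(labels, scores))
print("AUC with shuffled labels:", shuffled_label_auc(labels, scores))
```

A large gap between the two numbers is the kind of signal the abstract treats as evidence of a genotype-phenotype association; a value near 0.5 for the true labels would indicate no detectable link.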

4

A multiregional image-text dataset and benchmark for vision-language modeling of plant diseases

Nguyen, T. V.; Quoc, K. N.; Harwath, D.; Quach, L.-D.; Dao, P. D.

2026-07-09 plant biology 10.64898/2026.07.01.735881 medRxiv

Top 0.1%

13.1%

Plant diseases remain a major challenge to global food production, and timely, accurate, and scalable detection of plant stress is critical to reducing these losses. Recent advances in digital imaging and artificial intelligence offer unprecedented opportunities for precision crop disease detection and management. Yet existing plant disease datasets often remain fragmented across crop and disease systems and are largely dominated by controlled-environment imagery. The lack of standardized, interoperable, and representative datasets limits reproducibility, transferability, and scalability of AI systems, thereby constraining their deployment in operational agricultural applications. Here we present LeafMD, an integrated multimodal plant disease dataset and benchmark resource that includes LeafNet 2.0, a large-scale multimodal digital image dataset comprising 255,855 image-text pairs across 37 crop species, 197 crop-disease classes, and 9 geographic regions spanning tropical, subtropical, and temperate agricultural systems. Unlike conventional datasets, LeafNet 2.0 integrates biologically grounded symptom descriptions with image-level annotations of early and late disease stages, enabling symptom-aware analysis of disease progression under realistic field conditions. We further introduce LeafBench 2.0 as part of LeafMD, a visual question answering benchmark covering nine fine-grained plant pathology tasks, including pathogen classification, lesion characterization, symptom interpretation, and disease severity assessment. Evaluation across 16 vision-language models revealed substantial performance gaps between coarse disease recognition and fine-grained pathological reasoning, while agriculture-adapted models consistently outperformed several larger general-domain architectures on symptom-oriented tasks. Together, LeafNet 2.0 and LeafBench 2.0 establish LeafMD as a multimodal resource for developing disease-aware agricultural foundation models and studying fine-grained pathological reasoning in real-world environments.

5

RootQuant: Automated Root Trait Quantification From Minirhizotron Images Using Deep Learning

Parth, K.; Varela, S.; Liu, Z.; Martini, K. M.; Rajurkar, A.; Allan, D.; McCoy, S.; Ruhter, J.; Walker, S.; Goldenfeld, N.; Leakey, A.

2026-07-08 plant biology 10.64898/2026.07.07.737053 medRxiv

Top 0.1%

13.1%

Quantifying root traits such as root length (RL) and root surface area (RSA) from minirhizotron imagery is a valuable approach for overcoming the phenotyping bottleneck that limits understanding and improvement of crop productivity, resource use efficiency and resilience in field experiments. However, current approaches remain labor-intensive, and deep learning (DL) methods suffer from limited generalization ability. We present RootQuant, an end-to-end DL model that simultaneously predicts RL and RSA directly from minirhizotron images using only whole-image trait values as supervision, thereby eliminating the need for pixel-level annotations. The model's generalization ability was evaluated across species and fine-tuning configurations. The practical applicability of the model was further assessed under field conditions by converting image-derived RL estimates into volumetric root length density (vRLD). Using 118,191 maize and soybean images collected between 2009 and 2020, RootQuant trained on both species achieved an R² of 0.90 and an RMSE of 2.9 mm for RL, and an R² of 0.88 and an RMSE of 4.2 mm² for RSA. The same mixed-species model generalized strongly across species, yielding an 8% relative improvement in R² and a 30% lower RMSE on maize compared with the same architecture trained on a single species and applied zero-shot. Image-derived RL predictions converted to vRLD showed the expected depth-dependent decline in vRLD, as was also found by coincident destructive quantification of roots washed out of soil cores. By providing a generalist backbone model trained on a large dataset from two major crop species, RootQuant enables high-throughput simultaneous estimation of two relevant root traits directly from raw imagery without task-specific fine-tuning, thereby accelerating in situ root system analysis and phenotyping applications.
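
For readers unfamiliar with the two reported metrics, the following minimal sketch shows how RMSE and R² are commonly computed from predicted and reference trait values. It assumes R² means the coefficient of determination (the abstract does not say whether it is that or a squared correlation), and the function names and toy root-length values are hypothetical rather than taken from RootQuant.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the trait (e.g. mm)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy example (illustrative values only): per-image root length in mm.
rl_true = [10.0, 20.0, 30.0, 40.0]
rl_pred = [12.0, 18.0, 33.0, 39.0]

print("RMSE (mm):", rmse(rl_true, rl_pred))
print("R^2:      ", r_squared(rl_true, rl_pred))
```

RMSE is reported in the trait's own units, which is why the abstract gives it in mm for RL and mm² for RSA, while R² is unitless.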

6

Text guidance is powerful but prompt-sensitive for weakly-supervised leaf symptom segmentation

Dubois, R.; Bousset, L.; Jumel, S.; Leclerc, M.; Parisey, N.; Joly, A.

2026-07-10 plant biology 10.64898/2026.07.10.737680 medRxiv

Top 0.1%

13.0%

Accurate segmentation of plant disease symptoms is essential for crop monitoring and phenotyping, yet it typically requires costly pixel-level annotations. Weakly supervised semantic segmentation (WSSS) alleviates this burden using image-level labels, but its performance depends on the quality of spatial priors such as class activation maps (CAMs). We investigate whether text-guided segmentation with the Segment Anything Model 3 (SAM3) can serve as an alternative weak supervision signal. Three pseudo-mask generation strategies are compared: (i) CAMs refined with SAM or SAM3, (ii) zero-shot text-guided SAM3, and (iii) a hybrid approach combining weak spatial cues with text prompts. The resulting pseudo-masks are used to train a DeepLabV3 model. Text guidance alone matches or outperforms conventional WSSS, achieving up to 0.46 IoU without spatial supervision and 0.61 IoU on a public dataset, although performance is sensitive to text prompt formulation. The hybrid strategy improves robustness, reaching 0.50 IoU on the primary dataset and 0.58 IoU on the additional dataset while reducing prompt sensitivity. Overall, text guidance is a promising alternative to conventional weak supervision, while hybrid approaches provide a more robust solution for plant disease segmentation.
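
The IoU figures quoted above measure overlap between a predicted symptom mask and a reference mask. A minimal sketch of that metric for binary masks follows; the function name `mask_iou`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details from the preprint.

```python
def mask_iou(pred, target):
    """Intersection over union of two binary masks given as nested lists.

    Assumed convention: if both masks are empty, the IoU is defined as 1.0.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 1.0


# Toy 2 x 3 masks (illustrative only): 1 marks a symptomatic pixel.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]

print("IoU:", mask_iou(predicted, reference))
```

Benchmark scores are usually averaged over classes or images; how the averaging was done here is not stated in the abstract, so the sketch stops at the single-mask case.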

7

SeedMeasure: an efficient approach and open-source program to quantify seed size

Sims, B.; Gaudinier, A.; Blackman, B.

2026-06-29 plant biology 10.64898/2026.06.27.734974 medRxiv

Top 0.1%

5.3%

Premise: Seed size and morphology are critical traits in agriculture, ecology, and genetics, but high-throughput quantification of these traits is often limited by labor-intensive manual measurements or expensive, platform-specific imaging software. Methods and Results: We developed SeedMeasure, a lightweight, open-source, and cross-platform command-line tool written in Python that automates the measurement of seed area, length, and width from images. Using a simple imaging setup, the program processes images by correcting for perspective skew, filtering debris, and exporting quantitative data alongside quality-check images. We validated SeedMeasure across nine diverse species, ranging from small Arabidopsis thaliana seeds to large Zea mays kernels. The tool quickly handles images using multithreading and demonstrates high reproducibility, yielding low coefficients of variation across repeated runs. Conclusions: Compared to existing software, SeedMeasure is free, offers faster processing through parallel computing, and provides standalone executables that require no programming dependencies. SeedMeasure offers an accessible, cost-effective, and high-throughput approach for rapid phenotypic profiling, making advanced seed morphological analysis available to researchers without specialized laboratory hardware.
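
The reproducibility claim rests on the coefficient of variation (CV) across repeated runs. The sketch below shows one common way to compute it, using the sample standard deviation divided by the mean. It is not part of SeedMeasure; the function name and the repeated-run area values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Toy example (illustrative only): area of one seed, in mm^2,
# measured in three repeated runs on the same image set.
repeated_area = [10.0, 10.2, 9.8]

cv = coefficient_of_variation(repeated_area)
print(f"CV across runs: {cv:.2%}")
```

A low CV means the repeated measurements vary little relative to their mean, which is the sense in which the abstract describes the tool as reproducible.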

8

Knowledge-guided Bayesian optimization using pre-trained LLMs speeds up the identification of superior genotypes from germplasm collection

Hamazaki, K.; Tsuda, K.

2026-07-02 bioinformatics 10.64898/2026.06.28.735149 medRxiv

Top 0.1%

3.2%

Background: Germplasm collections contain wide genetic diversity that is valuable for plant breeding, but conducting phenotypic evaluation for all genotypes in field trials is rarely feasible. Bayesian optimization offers a way to decide, season by season, which genotypes to cultivate in order to identify superior genotypes with fewer evaluations. However, standard Bayesian optimization commonly starts from randomly selected genotypes and mainly relies on surrogate models built from marker genotype information, while the text-based passport information that accompanies germplasm is not fully used. We examined whether pre-trained large language models can provide prior knowledge that improves these decisions in germplasm evaluation. Results: We constructed a large-language-model-guided Bayesian optimization framework that introduces large language models into two parts of the Bayesian optimization workflow. In zero-shot warmstarting, a large language model proposes initial genotypes using passport information such as cultivar name, country of origin, and subpopulation, optionally together with principal component scores derived from genome-wide single-nucleotide-polymorphism markers. In addition, we evaluated a large-language-model-based surrogate model that predicts phenotypic values for untested genotypes using in-context learning from previously evaluated genotypes. Using a rice germplasm panel and two target traits (seed number per panicle for maximization and protein content for minimization), we compared strategies. For seed number per panicle, zero-shot warmstarting with a general-purpose instruction-following model reduced the number of evaluated genotypes needed to reach the best genotype, whereas improvements were small for protein content. 
When genomic information was available, Gaussian-process-based Bayesian optimization was the strongest overall approach, while the large-language-model-based surrogate model outperformed random baselines and was competitive in some settings. When genomic information was not available, predictions based on passport information improved efficiency compared with fully random strategies. Conclusions: Pre-trained large language models can inject useful agronomic knowledge into Bayesian optimization for germplasm evaluation, particularly by improving early-stage genotype selection, and can also support optimization when genomic information is unavailable. As models better handle long genomic sequences together with passport information, large-language-model-guided Bayesian optimization may become a practical and explainable decision-support approach for agricultural optimization.

9

VigExp: A functionally verified platform for aiding cowpea (Vigna unguiculata) and related legume crop improvement

Su, H.; Mazurkiewicz, D.; Gursanscky, N.; Riboni, M.; Juranic, M.; Johnson, S. D.; Yow, J. H.; Deo, J.; Liu, Y.; Mattinson, A.; Leon-Martinez, G.; Escobar-Guzman, R.; Salinas-Gamboa, R.; Amasende-Morales, I.; Vielle-Calzada, J.-P.; Koltunow, A. M. G.; Ferguson, B. J.

2026-07-09 plant biology 10.64898/2026.06.30.735734 medRxiv

Top 0.1%

3.1%

Legumes include some of the world's most significant crop species, such as cowpea (Vigna unguiculata), a subsistence crop widely grown in sub-Saharan Africa. Despite their importance, legume crop improvement is hindered by a lack of high-resolution expression data, particularly for reproductive tissues and cell types. Here, we report on VigExp, a tool for visualising cowpea gene expression datasets. We demonstrate its utility across a range of vegetative and reproductive cell types of varieties IT97K-499-35 and IT86D-1010, which exhibit 93.75% protein sequence conservation and are amenable to stable transformation. This includes previously published transcriptomes of vegetative, floral and seed tissues, combined with developmentally staged male and female reproductive tissues. Also integrated are novel transcriptomes of laser-captured cell types covering reproductive development from meiosis to early embryo formation post-fertilisation. Spatial expression patterns and transcript levels can be visualised through an electronic fluorescent pictograph (eFP) browser. Validated by RT-qPCR, in situ hybridisation, transgenic, and CRISPR gene editing analyses, the predictive accuracy of VigExp matches prior cowpea functional study observations. Critical genes for nodule development and regulation were also identified and their expression patterns established in cowpea. Novel reference genes, constitutively expressed gene promoters for visualization markers/gene-editing, and tissue- and cell-specific gene promoters for targeting these regions, are identified. The A-type cyclin, VuTAM2, was also identified, with a critical role in male meiosis established. Collectively, VigExp represents an adaptable and updatable resource to support crop improvement in cowpea and other legumes, which are often highly syntenic with respect to genome composition.

10

Apical3DTip: Elliptic Cross-section-based Reconstruction for the Embryo Initial Cell of Arabidopsis

Nonoyama, T.; Kang, Z.; Hanaki, Y.; Itagaki, Y.; Matsumoto, H.; Kimata, Y.; Tsugawa, S.; Ueda, M.

2026-07-09 plant biology 10.64898/2026.06.25.734685 medRxiv

Top 0.1%

3.1%

Background: Cell geometry plays a central role in determining division orientation and body axis formation during early embryogenesis in Arabidopsis thaliana. However, quantitative analysis of dynamic three-dimensional (3D) morphology remains challenging because live-imaging studies often rely on two-dimensional (2D) projections, while existing 3D reconstruction approaches, including mesh-based methods, often lose the original orientation information relative to the ovule and require labor-intensive mesh correction. In addition, embryo positional fluctuation caused by floating in liquid medium and continuous growth makes it difficult to analyze temporal morphological changes within a common coordinate system. Results: We developed a robust framework for quantitative 3D and four-dimensional (4D; 3D + time) analysis of embryo initial cell (apical cell) morphology. The method first establishes a standardized 3D coordinate system by normalizing cell orientation based on the bottom plane and the optical axis of the observation. Cell morphology is then reconstructed through ellipse-based approximation of serial cross-sections extracted from stacked imaging data, enabling accurate geometric characterization without the need for complex surface mesh reconstruction. To evaluate shape anisotropy, we quantified the apical cell shape in 3D. The framework further supports the characterization of volumetric features of subsequent division, providing a basis for quantifying 3D embryogenesis. Conclusion: Our framework provides a simple and noise-reduced approach for quantitative analysis of living cell morphology in 3D. We named the integrated method of combining coordinate normalization with elliptical cross-section-based reconstruction Apical3DTip. This method enables consistent comparison of cell shapes without extensive manual corrections. The method overcomes key limitations of 2D projection-based and mesh-dependent analyses and offers a practical platform for quantifying cell shape and daughter cell shapes in 3D. More broadly, it provides a quantitative foundation for exploring the relationship between cell geometry, morphodynamics, and developmental patterning in living plant embryos.
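
To make the idea of elliptical cross-section reconstruction concrete, the sketch below estimates a cell's volume and a simple anisotropy measure from a stack of already-fitted ellipses. It is only an illustration under stated assumptions: each optical section is summarized by its two semi-axes, sections are evenly spaced by `dz`, and anisotropy is taken as the major-to-minor axis ratio. None of the names, values, or the anisotropy definition come from Apical3DTip itself.

```python
import math


def ellipse_area(semi_major, semi_minor):
    """Area of an ellipse with the given semi-axes."""
    return math.pi * semi_major * semi_minor


def stack_volume(semi_axes, dz):
    """Approximate volume as the sum of ellipse areas times slice spacing."""
    return sum(ellipse_area(a, b) for a, b in semi_axes) * dz


def mean_aspect_ratio(semi_axes):
    """Mean ratio of the longer to the shorter semi-axis (1.0 means circular)."""
    ratios = [max(a, b) / min(a, b) for a, b in semi_axes]
    return sum(ratios) / len(ratios)


# Toy stack (illustrative only): five identical sections with
# semi-axes 2.0 and 1.0 (arbitrary length units), spaced 0.5 apart.
sections = [(2.0, 1.0)] * 5
slice_spacing = 0.5

print("Volume:           ", stack_volume(sections, slice_spacing))
print("Mean aspect ratio:", mean_aspect_ratio(sections))
```

Working with a few ellipse parameters per section, rather than a full surface mesh, is what lets this kind of approach avoid manual mesh correction, at the cost of assuming each cross-section is close to elliptical.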

11

High-throughput stomatal phenotyping provides selection targets for stress-resilient wheat

Mabrouk, M.; Russell, N. J.; Alegria, E. V.; Wang, T.-C.; Liang, J.-A.; Wu, F.-J.; Huang, Y.; Wittkop, B.; Snowdon, R.; Förter, L.; Moritz, A.; Herzog, E.; Ganji, E.; Wehner, G.; Stahl, A.; Chen, T.-W.

2026-07-13 plant biology 10.64898/2026.07.10.737162 medRxiv

Top 0.1%

3.1%

Phenotyping stomatal traits and their developmental plasticity is time-consuming but holds potential to improve water use efficiency and photosynthesis for designing stress-tolerant crops under climate change. Here, we develop a robust, high-throughput pipeline for phenotyping 14 stomatal traits in winter wheat related to size, variation, maximum conductance, and spatial patterning. We (1) analyze over 25,000 images from 60 wheat cultivars grown in growth chamber, greenhouse, and field conditions; (2) investigate the impact of light, temperature, and reduced water and nitrogen supply on stomatal traits and their developmental plasticity across adaxial and abaxial surfaces; and (3) evaluate genetic diversity and breeding progress of stomatal traits. Stomatal traits showed high broad-sense heritability, were largely plastic in response to environmental conditions, and showed genotype-specific responses. Stomatal traits of third leaves under controlled environments with stable light and temperature conditions reliably captured the genetic variance of flag leaves under field conditions. Our data suggest that the upper leaf surface contributed more to transpiration and cooling through consistently higher stomatal density, area, and maximum conductance, while the lower surface facilitated CO2 diffusion via systematic proper patterning and spacing. Breeding maintains the genetic diversity of stomatal traits, and our pipeline enables breeders to target them to enhance water use efficiency in high-yielding modern cultivars.

12

Self-Calibrated Hyperspectral Neural Radiance Fields for 3D Reconstruction of Bone and Bone Analogues

Sigger, N.; Nguyen, T. T.; Ashraf, S.; Tozzi, G.

2026-06-23 bioengineering 10.64898/2026.06.17.732938 medRxiv

Top 0.1%

2.6%

Hyperspectral imaging (HSI) has gained increasing attention for bone assessment because it captures rich wavelength-dependent information associated with mineralised tissue. HSI provides detailed spectral information related to material composition, while 3D geometric information supports the analysis of surface morphology and structural detail. However, integrating spectral and geometric information remains challenging, particularly when conventional reconstruction pipelines depend on external pose estimation. To address this challenge, we propose BoNeRF-HS, a self-calibrated hyperspectral neural radiance field for 3D reconstruction. BoNeRF-HS jointly optimises camera intrinsics, volume density, and hyperspectral radiance, removing the need for COLMAP-based poses. To improve spectral modelling, we incorporate a gated spectral adapter head that learns wavelength-dependent radiance features for hyperspectral view synthesis. We evaluate BoNeRF-HS on a multi-view hyperspectral dataset containing mouse bone, trabecular bone analogue, and cortical bone analogue samples. Experimental results demonstrate that our framework achieves improved reconstruction quality and better preservation of bone surface details compared with existing approaches.

13

PolliCrop: A high-throughput computer vision pipeline for pollinator monitoring in agroecosystems

Chabert, S.; Bernigaud-Samatan, J.; Blackman, B. K.; Blanchet, N.; Catrice, O.; Donnadieu, C.; Gani, M.; Grousset, R.; Husband, S.; Tueux, G.; Erler, S.; Langlade, N. B.

2026-07-13 animal behavior and cognition 10.64898/2026.07.08.737348 medRxiv

Top 0.1%

1.7%

Flower-visiting insect populations have been declining since the 1990s, especially because of the decrease of floral resources in agricultural settings. Mass-flowering crops can help increase resource availability, and plant breeding can be directed towards selecting varieties that attract more flower-visiting insects. This requires an automated high-throughput phenotyping tool for assessing the attractiveness of plant genotypes to flower-visiting insects. In this study, (i) we present a procedure to take standardized images of sunflower heads with camera traps continuously, day and night, in the field; (ii) we trained two versions of a deep learning model, named PolliCrop, to automatically detect and identify three classes of the main insects visiting sunflower in these images (non-Bombus bees, bumble bees, lepidopterans); (iii) we assessed and validated the ability of PolliCrop to correctly predict the true visitation frequencies of the insect classes on three sunflower genotypes; (iv) we present two statistical approaches to compare insect visitation frequencies between plant genotypes, one including weather variables and the other without. One PolliCrop version performed satisfactorily in detecting the three insect classes. In particular, it predicted the insect visitation frequencies on two sunflower genotypes to within ±10%. The other PolliCrop version can be useful in certain contexts of images and objectives. PolliCrop can be extended in the future to other crop species by training it on new images captured in these crops. The field experimental design needed to compare attractiveness between genotypes is also discussed.

14

A genetic toolkit to reduce wheat immunogenicity and incidence of celiac disease

Rottersman, M. G.; Laudencia-Chingcuanco, D.; Zhang, W.; Guzman-Lopez, M. H.; Lin, J. W.; Zhang, J.; Caseys, C.; Burguener, G.; Kim, S.; Zhang, X.; Yunusbaev, U.; Akhunov, E.; Lee, J.-Y.; Dubcovsky, J.

2026-07-08 plant biology 10.64898/2026.06.23.734071 medRxiv

Top 0.1%

1.7%

Celiac disease (CeD) is an immune-mediated condition triggered by wheat gluten in genetically predisposed individuals. The immune reaction in people with CeD is driven by particular gluten amino acid sequences, or immunogenic epitopes. Some of these epitopes elicit strong immune responses in the majority of CeD patients and are designated as immunodominant epitopes. Previous research has shown correlations between the amount of immunogenic wheat epitopes consumed and the onset of CeD, suggesting that reducing wheat immunogenic epitopes may reduce CeD incidence at the population level. Gluten consists of gliadins and glutenins, with gliadins having the majority of the immunodominant epitopes and glutenins playing a major role in dough strength and breadmaking quality (BMQ). This study used radiation-induced deletions, chemical mutagenesis, and natural variation in wheat (Triticum aestivum) to generate genetic stocks with reduced immunogenic epitope content. Most lines were developed in the wheat cultivar Summit, for which we produced a full genome assembly and annotation. We used exome capture to characterize these deletions and identify prolamins located within and outside the deletions. We combined different deletions and developed molecular markers to facilitate their deployment. For chromosome arms 1BS and 1DS, we generated two alternative lines: one lacking immunogenic epitopes for the development of CeD-safe genetic stocks for research purposes, and another retaining selected glutenins for breeding commercial lines with reduced immunogenicity and adequate BMQ. By making these non-transgenic genetic stocks publicly available, we aim to accelerate the development of wheat varieties with reduced immunogenicity and, eventually, a fully CeD-safe wheat.

15

Multi-trait evaluation of a tomato MAGIC population identifies promising lines with improved nitrogen use efficiency (NUE)

Baraja-Fonseca, V.; Gil-Villar, D.; Bancic, J.; Renau-Morata, B.; Salud Justamante, M.; Plazas, M.; Gramazio, P.; Vilanova, S.; Perez-Perez, J. M.; Granell, A.; Molina, R. V.; Nebauer, S. G.; Prohens, J.; Arrones, A.

2026-07-15 plant biology 10.64898/2026.07.14.738388 medRxiv

Top 0.2%

1.4%

Nitrogen-use efficiency (NUE) is a pivotal breeding target in tomato (Solanum lycopersicum L.) to sustain production under reduced N inputs. Here, we leveraged a recently developed tomato multi-parent advanced generation inter-cross (ToMAGIC) population to identify lines with superior performance under reduced N availability. The eight founders and a core subset of 118 ToMAGIC lines were characterized with 10,684 SNP markers and evaluated under optimal (opN, 15 mM) and suboptimal (subN, 8 mM) N supply in an experiment totalling 1,576 plants, generating 48,068 data points across 61 phenotypic variables. Under both N treatments, ToMAGIC lines exhibited transgressive segregation for most traits, confirming the value of this population as a reservoir of untapped variation. Notably, under subN conditions, harvest index (Hi) increased by 29-44%, suggesting adaptive resource redistribution toward reproductive sinks. Variance partitioning revealed that agronomic and NUE-related traits were largely under genetic control, with heritability estimates frequently above 0.80 and broadly conserved across N treatments. Multivariate trait analysis identified fruit yield N concentration (NUE component, CN,y), shoot biomass N content (NAb), and shoot growth-related traits as the main drivers of treatment differentiation. Finally, proxy traits were prioritized by integrating response magnitude, heritability, trait correlations, and treatment-discriminatory power into multi-trait selection indices. This strategy generated favorable predicted genetic gains, reaching 158% for high-performance lines and 170% for subN-adapted lines, and consistently identified lines 402, 428, 518, 800, and 816 as promising pre-breeding materials. Overall, this study supports ToMAGIC as a powerful resource for developing N-efficient cultivars suited for sustainable agriculture.

16

From Phenomics to Genomics: Macro-GWAS of Almond Morphology and Quality

Mas Gomez, J.; Rubio Angulo, M.; Duval, H.; Dicenta, F.; Martinez-Garcia, P. J.

2026-07-07 plant biology 10.64898/2026.07.06.736816 medRxiv

Top 0.2%

1.1%

In plant breeding and genetics, recent advances in high-throughput phenotyping are beginning to meet the growing demand for large-scale, high-quality phenotypic data that emerged after the development of next-generation sequencing technologies. Recent developments in phenomics have been incorporated into almond breeding programs, facilitating the large-scale acquisition of quantitative phenotypes and the dissection of the genetic architecture underlying morphological and quality-related traits. The implementation of a high-throughput phenotyping platform integrating RGB and hyperspectral imaging with genotyping using the 60K almond SNP array enabled the large-scale characterization of almond populations and the identification of 567 robust marker-trait associations across 66 traits. These analyses revealed two major genomic hotspots on chromosomes 2 and 5 associated with morphological and quality-related traits. These regions harbored biologically relevant candidate genes, including genes associated with OVATE family proteins, brassinosteroid signaling, protein ubiquitination, and acyl-CoA metabolism, as well as other regulators of organ growth, cell proliferation, hormone signaling, and seed development. Furthermore, a novel candidate gene encoding a COMT-like O-methyltransferase involved in lignin biosynthesis was identified and proposed to contribute to shell hardness, a major genetically controlled trait in almond. Together, these findings demonstrate the potential of integrating high-throughput phenomics and genomics to dissect complex traits, identify candidate genes, and accelerate genomics-informed breeding in almond.

17

Amplification-free CRISPR/Cas13a-based viroid detection in RNA extracts from infected plants

Le, L. T. T.; Montagud-Martinez, R.; Rodrigo, G.; Daros, J.-A.

2026-07-09 plant biology 10.64898/2026.07.02.736049 medRxiv

Top 0.2%

0.9%

Viroids are plant infectious agents that threaten agricultural production. Current viroid detection methods rely on RT-PCR-based assays, which require specialized laboratory equipment and can sometimes produce false-negative results or non-specific amplification due to the high sequence conservation among closely related viroid species. CRISPR-based diagnostics, particularly Cas12-based systems for DNA detection (DETECTR) and Cas13a-based systems (SHERLOCK) for RNA detection, have emerged as powerful tools for nucleic acid diagnostics. However, most existing workflows still rely on target amplification and, in the case of Cas13a systems, require additional in vitro transcription steps, limiting their simplicity and direct applicability for plant diagnostics. Here, we developed a direct amplification-free Cas13a-based detection platform for viroids using potato spindle tuber viroid (PSTVd) as a model. We optimized CRISPR RNA (crRNA) design, identified inhibitory effects of plant total RNA on readout signal, and employed simplified viroid RNA enrichment workflows enabling robust detection in plant samples. The system further supported both PSTVd-specific and broad-spectrum pospiviroid (genus Pospiviroid) detection and was successfully extended to avocado sunblotch viroid (family Avsunviroidae), demonstrating its adaptability across distinct viroid families. Together, these results establish a practical and modular Cas13a-based platform, not only for viroid diagnostics, but also for broader applications in RNA-derived plant pathogen detection. 
Significance statement: A simplified RNA enrichment workflow combined with CRISPR-Cas13a enables direct, amplification-free detection of plant viroids. The assay supports early and reliable diagnosis across different tomato varieties and provides a practical strategy for improving molecular detection of plant pathogens.

18

Haplotypes variations of yellow stripe like (TaYSL) genes are associated with grain iron and zinc contents in wheat (Triticum aestivum L.)

Abbasi, K.; Qayyum, H.; Naseer, S.; Sun, M.; Quraishi, M. A.; Danyal, Y.; Hao, Y.; He, Z.; Rasheed, A.

2026-07-08 plant biology 10.64898/2026.06.17.732851 medRxiv

Top 0.2%

0.9%

The availability of a pangenome and resequenced wheat collections has facilitated the discovery of gene-trait associations in wheat. Yellow stripe-like (YSL) proteins play a key role in the uptake and translocation of metals, yet they have not been fully identified and analyzed at the genome-wide level in wheat. In this study, 26 TaYSL genes were identified and divided into four distinct clades, each clade sharing similar domains and motif compositions. Most genes were upregulated under iron deficiency, whereas homoeologs of TaYSL1 were downregulated. Both SNP-based and haplotype-based association studies were used to dissect the role of TaYSLs underpinning grain iron content (GFeC) and zinc content (GZnC) in wheat. TaYSL6-2B and TaYSL16-1A haplotypes showed strong association with GFeC, and TaYSL14-6A showed strong association with GZnC in multiple field trials. The distribution of favorable haplotypes in a global wheat collection of ~3,000 accessions showed that the majority of haplotypes were more prevalent in landraces and winter wheat than in modern cultivars and spring types, indicating their potential for use in breeding. Combinations of favorable haplotypes of the three YSL genes associated with GFeC and GZnC were very rare, and most wheat accessions have single or double favorable haplotypes. These findings provide the first comprehensive characterization of the TaYSL gene family in wheat and identify significant SNPs and elite haplotypes that can be utilized for genetic improvement and biofortification.

19

Introducing PHJ Media: A Unique Machine Learning-Driven Basal Formulation to Overcome Recalcitrance for Multi-Genotype Micropropagation of Cannabis sativa L.

Pepe, M.; Hesami, M.; Jones, M.

2026-07-15 plant biology 10.64898/2026.07.14.738465 medRxiv

Top 0.2%

0.9%

Applications of tissue culture are critical for Cannabis sativa L. (cannabis), supporting clonal propagation, germplasm preservation, and pathogen elimination, among other biotechnological applications. However, the extensive genetic diversity of cannabis results in highly variable responses to in vitro conditioning, and no consensus basal media formulation exists to support reproducible micropropagation across genotypes. To address these limitations, a hybridized ensemble-NSGA-II approach was employed for concurrent optimization of individual media components to create a species-specific, cultivar-inclusive basal salt formulation for cannabis micropropagation. The resulting PHJ media represents a unique formulation that overcomes recalcitrance across a wide array of cannabis cultivars, facilitating improved growth and uniformity for the nine cultivars used in its development and validation. These results remain consistent from explant initiation through multiple rounds of subculture. The ability of PHJ to overcome genotypic recalcitrance suggests potential applicability to an array of plant species beyond cannabis. Additionally, robust performance both with and without plant growth regulators underscores the plausible use of PHJ for diverse applications beyond standard micropropagation. Ultimately, this cultivar-inclusive basal medium demonstrates utility for both scientific research and industrial-scale operations.

20

An in vitro regeneration system with efficient rooting in sweet orange (Citrus sinensis) supports recovery of transgenic plants

Datta, J.; Bhowmik, S. D.; Williams, B.; Kerr, S. C.

2026-07-08 plant biology 10.64898/2026.06.16.732047 medRxiv

Top 0.2%

0.8%

In vitro regeneration of Citrus plants is a widely used method; however, induction of adventitious roots from regenerated shoots remains a major bottleneck, limiting the recovery of healthy plants for commercial production and genomic research for crop improvement. We established an in vitro regeneration system producing profuse, healthy roots for sweet orange (Citrus sinensis cv. Benyenda) by optimising combinations and concentrations of auxins. Prior to optimising the rooting media (RTMs), we obtained a shoot regeneration rate of 90.6% from sweet orange epicotyl explants using a cytokinin, 6-benzylaminopurine (BAP). Across twelve auxin-supplemented RTMs containing different concentrations of indole-3-butyric acid (IBA) and/or 1-naphthaleneacetic acid (NAA), rooting percentages ranged from 8% to 87.5%. The combination of IBA 1.0 mg L⁻¹ and NAA 0.1 mg L⁻¹ promoted the best overall performance, with a 75 ± 7.2% rooting percentage and healthy, callus-free roots (≥5 cm in length), whereas RTMs with other auxin combinations induced callus and limited root elongation. The best-performing SRM and RTM were subsequently used for selection and recovery of transgenic sweet orange lines carrying an empty CRISPR/Cas9 construct, resulting in a 4.8% transformation efficiency. Both transgenic and non-transgenic rooted plantlets were successfully acclimatised under glasshouse conditions with a survival rate of 90%. This enhanced regeneration system overcomes the rooting bottleneck and improves plant survival, enabling faster recovery of transgenic citrus lines within four months. It supports accelerated development for commercial applications and advances in citrus genetic improvement.